Abstract

This paper analyzes the spectral properties of Tyler's M-estimator for
scatter $T$. It is shown that if a multivariate sample stems from a generalized
spherically distributed population and the sample size $n$ and the dimension
$d$ both go to infinity while $d/n \to 0$, then the empirical spectral distribution
of $\sqrt{n/d}\,(T - I)$ converges in probability to the semicircle law, where $I$ is
the identity matrix. In contrast to that of the sample covariance matrix, this
convergence does not necessarily require the sample vectors to be
componentwise independent. Further, moments of the generalized spherical population
do not have to exist.
1. Introduction

The spectral analysis of large dimensional random matrices has become
an active field of research during the last decades because of its broad
applicability to many practical problems such as wireless communications, statistics
and finance. A main tool of this analysis is the empirical spectral
distribution function (ESD) of a $d$-dimensional matrix $A$ having real eigenvalues.
Denote by $\lambda_1(A) \le \ldots \le \lambda_d(A)$ the eigenvalues of $A$. Then, the ESD of $A$ is
defined as

\[ F^A(x) = \frac{1}{d} \sum_{i=1}^{d} \mathbb{1}_{(-\infty,x]}(\lambda_i(A)), \]

where

\[ \mathbb{1}_M(x) = \begin{cases} 1, & x \in M \\ 0, & \text{else} \end{cases} \]

for some set $M$. If the entries of $A$ are random variables, then, as $d \to \infty$ and
assuming a certain distribution of these entries, the ESD of $A$ may converge
to a non-random limit in some sense. Wigner (1955) investigates special
random matrices in the context of quantum mechanics, so-called Wigner
matrices, whose expected limiting ESD is the well known semicircle law, a
continuous distribution function with density

\[ dw(x) = \frac{1}{2\pi} \sqrt{4 - x^2}\; \mathbb{1}_{[-2,2]}(x)\, dx. \tag{1} \]
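As a small illustration of these two definitions, the following Python sketch evaluates the ESD of a symmetric matrix and the semicircle density (1). The function names `esd` and `semicircle_density` are illustrative choices rather than notation from the paper, and NumPy is assumed to be available.

```python
import numpy as np

def esd(A, x):
    """Empirical spectral distribution F^A(x) of a symmetric matrix A."""
    eigenvalues = np.linalg.eigvalsh(A)      # real eigenvalues, ascending order
    return np.mean(eigenvalues <= x)         # fraction of eigenvalues in (-inf, x]

def semicircle_density(x):
    """Density (1 / (2*pi)) * sqrt(4 - x^2) on [-2, 2], zero elsewhere."""
    x = np.asarray(x, dtype=float)
    return np.sqrt(np.clip(4.0 - x**2, 0.0, None)) / (2.0 * np.pi)

# Toy check: a diagonal matrix has its diagonal entries as eigenvalues.
A = np.diag([1.0, 2.0, 3.0])
print(esd(A, 2.0))                # share of eigenvalues that are <= 2
print(semicircle_density(0.0))    # density at the center of the support
```

The ESD is a step function with jumps of size $1/d$ at the eigenvalues, so for the toy matrix it only takes the values 0, 1/3, 2/3 and 1.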

In statistics, the limiting behavior of the ESD of large dimensional sample
covariance matrices has attracted much interest. These matrices are of the
type

\[ S = \frac{1}{n} \sum_{j=1}^{n} X_j X_j', \]

where $X_j = (X_{1j}, \ldots, X_{dj})'$, $1 \le j \le n$, is a centered $d$-dimensional
sample. Assuming that the $X_{ij}$ are i.i.d. with mean 0 and variance 1, written
$X_{ij} \overset{\text{i.i.d.}}{\sim} (0,1)$, the almost sure limit of the ESD of
$S$ under the asymptotics $n, d \to \infty$ and $d/n \to y \in (0, \infty)$ is derived in
Marcenko and Pastur (1967): the famous Marcenko-Pastur law. It is given
by a distribution function $F_y(x)$ satisfying

\[ dF_y(x) = \left(1 - \frac{1}{y}\right)^{+} d\delta_0(x) + f_y(x)\, dx, \tag{2} \]

where

\[ f_y(x) = \frac{1}{2\pi x y} \sqrt{(a_+ - x)(x - a_-)}\; \mathbb{1}_{[a_-,a_+]}(x), \]

$a_\pm = (1 \pm \sqrt{y})^2$ and $\delta_0$ is the Dirac delta function in 0. Further, a suitable
standardization of $S$ yields the semicircle law as $n, d \to \infty$, $d/n \to y = 0$.
More precisely, Bai and Yin (1988) show that if $X_{ij} \overset{\text{i.i.d.}}{\sim} (0,1)$ and
$E|X_{11}|^4 < \infty$, then, as $n, d \to \infty$ and $d/n \to 0$, the ESD of

\[ S^* = \sqrt{\frac{n}{d}}\,(S - I) \]

converges almost surely to the semicircle law (1). Here, $I$ denotes the identity
matrix.
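The standardization can be tried out numerically. The sketch below simulates Gaussian data with a small ratio $d/n$, forms $S^* = \sqrt{n/d}\,(S - I)$ and computes the second and fourth moments of its ESD, which should lie near the semicircle values 1 and 2. The helper names, the seed and the sizes $d = 100$, $n = 10000$ are arbitrary assumptions made for illustration only.

```python
import numpy as np

def standardized_sample_covariance(X):
    """S* = sqrt(n/d) * (S - I) with S = (1/n) * sum_j X_j X_j'.

    X has shape (d, n): each column is one centered observation X_j.
    """
    d, n = X.shape
    S = X @ X.T / n
    return np.sqrt(n / d) * (S - np.eye(d))

def spectral_moment(A, m):
    """m-th moment of the ESD of a symmetric matrix A, i.e. (1/d) * tr(A^m)."""
    return np.mean(np.linalg.eigvalsh(A) ** m)

rng = np.random.default_rng(0)               # fixed seed, illustrative only
d, n = 100, 10_000                           # d/n = 0.01
X = rng.standard_normal((d, n))
S_star = standardized_sample_covariance(X)
m2 = spectral_moment(S_star, 2)              # semicircle second moment: 1
m4 = spectral_moment(S_star, 4)              # semicircle fourth moment: 2
print(m2, m4)
```

For finite $d$ and $n$ the simulated moments only approximate the limiting values; the deviation shrinks as $d$ grows and $d/n$ decreases.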

There is research on the question of how the strict assumption of an
identical and independent distribution of the $X_{ij}$ can be weakened. Bai and Zhang
(2007) give an overview showing that a Lindeberg-type condition for the $X_{ij}$ can
replace an identical distribution. Further, they show that the convergence of
the ESD of $S^*$ to the semicircle law is still valid if $S^*$ is sparse, which means
that some entries of $S^*$ are missing in a certain sense. But still the
assumption of horizontal and vertical independence of the $X_{ij}$ is required. We show
in this paper that if we do not use the sample covariance matrix but Tyler's
M-estimator to estimate the true covariance matrix of a standard normal
population and standardize this estimator in the same manner as $S$, then
the associated ESD also converges in probability to the semicircle law (1).
Moreover, it is shown that this convergence holds even if the population is
generalized spherically distributed. Since the components of a generalized
spherical population are not independent (except in the case of normality),
the condition of vertical independence of the $X_{ij}$ is weakened. We see further
that certain moment conditions can be relaxed as well.

The next section briefly provides all necessary information about shape
matrices and their connection to Tyler's M-estimator so that the main results
can be stated. Section 3 will then give the proof of the results, which is
followed by a short outlook for future research.

2. Tyler's M-estimator 

The shape matrix of a $d$-dimensional population $X$ is defined as the
symmetric and positive definite solution $\Omega = \Omega(X) \in \mathbb{R}^{d \times d}$ of the equation

\[ E\left[ \frac{\Omega^{-1/2}(X - \mu)(X - \mu)'\,\Omega^{-1/2}}{(X - \mu)'\,\Omega^{-1}(X - \mu)} \right] = \frac{1}{d}\, I, \tag{3} \]

where $\mu \in \mathbb{R}^d$ is the center of the distribution of $X$ and $\Omega^{-1/2}$ denotes the
symmetric root of the inverse $\Omega^{-1}$. Note that $\mu$ does not have to be the
expectation of $X$ but may also be its median (see also Frahm and Jaekel
(2009), Section 3). In the following, only $\mu = 0$ will be of interest so that we
will not further discuss this point. Equation (3) determines $\Omega$ uniquely up to
a scalar multiple so that a suitable scaling is needed. We choose $\operatorname{tr}(\Omega) = d$,
where $\operatorname{tr}$ denotes the trace operator, and refer to Frahm (2009) for a detailed
discussion about shape matrices and their scales.
If $X$ is spherically distributed, i.e.,

\[ X \overset{d}{=} R\, U^{(d)}, \tag{4} \]

where $R \ge 0$ is a scalar random variable being independent from $U^{(d)} \sim
U(S^{d-1})$, where $U(S^{d-1})$ denotes the uniform distribution over the unit sphere
in $\mathbb{R}^d$, then $\Omega(X) = \Omega(U^{(d)}) = I$ with respect to the center $\mu = 0$. We see
that the shape matrix does not depend on $R$. Especially, if $R = \sqrt{\chi^2_d}$,
where $\chi^2_d$ denotes a random variable which is $\chi^2$ distributed with $d$ degrees
of freedom, we obtain that $X \sim N_d(0, I)$, i.e., $X$ is standard normal so that
the covariance and shape matrix of $X$ agree.

Now, we assume that $\mu$ is known and set w.l.o.g. $\mu = 0$. The shape
matrix of $X$ can be estimated by its sample counterpart $\hat\Omega$ as the solution of

\[ \frac{1}{n} \sum_{j=1}^{n} \frac{\hat\Omega^{-1/2} X_j X_j'\, \hat\Omega^{-1/2}}{X_j'\, \hat\Omega^{-1} X_j} = \frac{1}{d}\, I \]

or, equivalently, of

\[ \hat\Omega = \frac{d}{n} \sum_{j=1}^{n} \frac{X_j X_j'}{X_j'\, \hat\Omega^{-1} X_j}, \tag{5} \]

where $X_1, \ldots, X_n$ is a sample drawn from $X$. The estimator $\hat\Omega$ is known
as Tyler's M-estimator for scatter and was introduced by Tyler (1987), which
is why we set $T = \hat\Omega$ for short. Again, we choose the scaling $\operatorname{tr}(T) = d$
in order to uniquely define $T$, as is also proposed by Tyler (1987). The
existence of $T$ is assured if $n > d$ (see Kent and Tyler (1991), Lemma 2.1)
and its computation can be done by performing an iteration scheme leading
to a unique symmetric and positive definite solution of Equation (5) (see
also Tyler (1987)).
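A minimal sketch of such an iteration is given below. It repeatedly plugs the current iterate into the right-hand side of Equation (5) and rescales to trace $d$; this is one common fixed-point variant and not necessarily the exact scheme of Tyler (1987). The function name `tyler_m_estimator`, the tolerance, the iteration cap and the toy data are assumptions made for illustration.

```python
import numpy as np

def tyler_m_estimator(X, tol=1e-10, max_iter=500):
    """Fixed-point sketch for Tyler's M-estimator with center 0.

    X has shape (n, d): each row is one observation X_j, with n > d.
    Returns a symmetric matrix T scaled so that tr(T) = d.
    """
    n, d = X.shape
    T = np.eye(d)
    for _ in range(max_iter):
        # q_j = X_j' T^{-1} X_j for every observation
        q = np.einsum("ij,ij->i", X, np.linalg.solve(T, X.T).T)
        # right-hand side of Equation (5): (d/n) * sum_j X_j X_j' / q_j
        T_new = (d / n) * (X / q[:, None]).T @ X
        T_new *= d / np.trace(T_new)         # enforce the scaling tr(T) = d
        if np.linalg.norm(T_new - T) < tol:
            return T_new
        T = T_new
    return T                                 # last iterate if not converged

rng = np.random.default_rng(1)               # toy data, illustrative only
X = rng.standard_normal((400, 5))
T = tyler_m_estimator(X)
print(np.trace(T))
```

Each summand $X_j X_j' / (X_j' T^{-1} X_j)$ is unchanged when $X_j$ is multiplied by a nonzero scalar, so the iterates depend on the observations only through their directions.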

Since $\Omega(X) = \Omega(RX)$ for a scalar random variable $R$, this invariance
is inherited by $T = T(X)$, meaning that $T(X) = T(Y)$ if $X = RY$ for
a $d$-dimensional random variable $Y$. This property is quite appealing for
elliptical populations (with center 0), i.e.,

\[ X \overset{d}{=} R\, \Lambda\, U^{(k)}, \]

where $\Lambda \in \mathbb{R}^{d \times k}$ and $R \ge 0$ is a scalar random variable being independent
from $U^{(k)} \sim U(S^{k-1})$ (see also Fang et al. (1990), Chapter 2). Hence, $T$ is
distribution-free within the class of elliptical distributions.

Frahm (2004) introduces the class of generalized elliptical distributions
whose members have the same stochastic representation as an elliptical
random variable. But this class additionally allows for $R < 0$ and dependence
between $R$ and $U^{(k)}$. These features are quite useful when dealing with
financial data, as is mentioned in Frahm and Jaekel (2009). It is clear that $T$
is also distribution-free within that class. Frahm (2004), Chapter 4, derives
$T$ as a maximum likelihood estimator for $\Omega(X)$ assuming that $X$ is
generalized elliptically distributed. Thus, $T$ has many desired properties such as
consistency and asymptotic normality.

Tyler's M-estimator has another outstanding property concerning
robustness. Tyler (1987) shows that $T$ is the "most robust" estimator for the shape
matrix of an elliptical population. This means that if $X$ is elliptical, then
the maximum asymptotic variance of $T$ is a minimum within the set of
maximum asymptotic variances of all consistent and asymptotically normally
distributed shape matrix estimators.

Following the idea of generalized elliptical distributions, we define the
class of generalized spherical distributions as the set of all random variables
$X$ having the stochastic representation (4), where $R \neq 0$ is a scalar random
variable and $U^{(d)} \sim U(S^{d-1})$. Similar to generalized elliptical distributions,
$R$ and $U^{(d)}$ may depend on each other and $R$ may also take negative values.
Clearly, we have $\Omega(X) = I$ with respect to the center $\mu = 0$ if $X$ is
generalized spherically distributed. This class of distributions is of interest in the
following theorem.

Theorem. Let $X_1, \ldots, X_n$ be an i.i.d. sample drawn from a generalized
spherical population $X$ of dimension $d$. Let $T$ be Tyler's M-estimator being
normalized so that $\operatorname{tr}(T) = d$. Then, as $n, d \to \infty$ and $d/n \to 0$,

1. the ESD of

\[ T^* = \sqrt{\frac{n}{d}}\,(T - I) \]

converges in probability to the semicircle law (1) and

2. $\lambda_1(T^*) \overset{p}{\to} -2$, $\lambda_d(T^*) \overset{p}{\to} 2$.

Here, "$\overset{p}{\to}$" denotes convergence in probability. Some consequences of
the theorem are worth pointing out. First, in contrast to the convergence of
the ESD of $S^*$ to the semicircle law, it is not assumed that the components
of $X$ are independent. The uncorrelatedness of the components of $U^{(d)}$ is
required instead because of the distribution-freeness of $T$. Further, moments
of $X$ do not have to exist. For example, if $R$ and $U^{(d)}$ are independent and
$R = \sqrt{d\, F_{d,p}}$, where $F_{d,p}$ is an $F$-distributed random variable with $d$ and $p$
degrees of freedom, the population has a $d$-dimensional $t$-distribution with
$p$ degrees of freedom. In the case of $p = 1$, we obtain the Cauchy distribution
whose expectation does not exist. This is in sharp contrast to the sample
covariance matrix, which even requires the existence of the fourth moment of
the components of $X$ (see Bai and Yin (1988)).
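The Cauchy example can be simulated directly. The sketch below draws $X = R\,U^{(d)}$ with $R = \sqrt{d\,F_{d,1}}$, computes Tyler's M-estimator with a simple fixed-point loop, and reports the extreme eigenvalues and the second spectral moment of $T^* = \sqrt{n/d}\,(T - I)$. The helper `tyler`, the seed and the sizes $d = 40$, $n = 4000$ are assumptions for illustration; at such finite sizes the values are only expected to be roughly near $-2$, $2$ and $1$.

```python
import numpy as np

def tyler(X, tol=1e-9, max_iter=200):
    """Fixed-point sketch for Tyler's M-estimator; rows of X are observations."""
    n, d = X.shape
    # T depends only on the directions X_j / ||X_j||, so heavy tails are harmless.
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    T = np.eye(d)
    for _ in range(max_iter):
        q = np.einsum("ij,ij->i", X, np.linalg.solve(T, X.T).T)   # X_j' T^{-1} X_j
        T_new = (d / n) * (X / q[:, None]).T @ X
        T_new *= d / np.trace(T_new)                              # tr(T) = d
        converged = np.linalg.norm(T_new - T) < tol
        T = T_new
        if converged:
            break
    return T

rng = np.random.default_rng(2)                       # fixed seed, illustrative only
d, n, p = 40, 4000, 1                                # p = 1: multivariate Cauchy
Z = rng.standard_normal((n, d))
U = Z / np.linalg.norm(Z, axis=1, keepdims=True)     # uniform on the unit sphere
R = np.sqrt(d * rng.f(d, p, size=n))                 # R = sqrt(d * F_{d,p})
X = R[:, None] * U                                   # d-dimensional t-sample, p d.o.f.

T_star = np.sqrt(n / d) * (tyler(X) - np.eye(d))
eig = np.linalg.eigvalsh(T_star)
print(eig[0], eig[-1], np.mean(eig**2))
```

Replacing `R` by any other nonzero scalar factors leaves `tyler(X)` unchanged up to rounding, which is the distribution-freeness used throughout the paper.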



3. Proof of the theorem 

In the following, we consider $n$ as an integer-valued function of $d$ with
$\lim_{d \to \infty} n(d) = \infty$ and $d = o(n)$ as $d \to \infty$. So, we just write $d \to \infty$
for $d, n \to \infty$ and $d/n \to 0$. Almost sure convergence will be denoted
by $\overset{a.s.}{\to}$ and $\overset{L^1}{\to}$ stands for convergence in mean. Further, we set
$\|A\|_2 := \max\{|\lambda_1(A)|, |\lambda_d(A)|\}$ for a symmetric $d$-dimensional matrix $A$ which may
be deterministic or random. In the latter case, $\|A\|_2$ is also a random
variable.

First, because of the distribution-freeness of $T$, we may assume w.l.o.g.
that $X \sim N_d(0, I)$, i.e., $R = \sqrt{\chi^2_d}$ being independent from $U^{(d)}$. We will
now prove the first part of the theorem by applying the moment convergence
theorem (MCT) with respect to that population. The $m$-th moment of the
ESD of $T^*$ is given by

\[ \int x^m\, dF^{T^*}(x) = \frac{1}{d} \sum_{i=1}^{d} \lambda_i^m(T^*) = \frac{1}{d} \operatorname{tr}([T^*]^m). \]

Since the semicircle law is uniquely defined by its moments (because its
support is compact), the MCT is applicable. It says that the convergence of

\[ \frac{1}{d} \operatorname{tr}([T^*]^m) \overset{p}{\to} \int x^m\, dw(x) \]

for every fixed $m \in \mathbb{N}$ as $d \to \infty$ is sufficient for

\[ F^{T^*}(x) \overset{p}{\to} w(x) \]

as $d \to \infty$ (see also the introduction in Bai and Silverstein (2010)). We will
show that

\[ \frac{1}{d} \operatorname{tr}([T^*]^m) - \frac{1}{d} \operatorname{tr}([S^*]^m) \overset{p}{\to} 0 \tag{6} \]

as $d \to \infty$. Since we have from Bai and Yin (1988) that

\[ \frac{1}{d} \operatorname{tr}([S^*]^m) \overset{a.s.}{\to} \int x^m\, dw(x) \]

as $d \to \infty$, the triangle inequality leads to the result. Now, we need four
small propositions.
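The limiting moments $\int x^m\, dw(x)$ that appear in the MCT can be evaluated numerically. The sketch below integrates $x^m$ against the semicircle density (1) and compares the even moments with the Catalan numbers, which are their known closed-form values (a standard fact that is not stated in the paper); odd moments vanish by symmetry. The function names and the number of quadrature steps are illustrative assumptions.

```python
import math

def semicircle_moment(m, steps=20_000):
    """Numerically integrate x^m against the semicircle density (1).

    Substituting x = 2*sin(theta) turns the integral into
    (2/pi) * int_{-pi/2}^{pi/2} (2*sin(theta))^m * cos(theta)^2 dtheta,
    which is approximated here with the midpoint rule.
    """
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = -math.pi / 2 + (k + 0.5) * h
        total += (2.0 * math.sin(theta)) ** m * math.cos(theta) ** 2
    return 2.0 / math.pi * total * h

def catalan(k):
    """k-th Catalan number, the known value of the (2k)-th semicircle moment."""
    return math.comb(2 * k, k) // (k + 1)

for m in range(1, 7):
    print(m, round(semicircle_moment(m), 6))
```

Because the support $[-2, 2]$ is compact, these moments grow at most like $2^m$, which is what makes the semicircle law uniquely determined by them.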

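Proposition 1. Let $X \sim N_d(0, I)$. Then:

\[ \|T^* - S^*\|_2 \overset{p}{\to} 0 \quad \text{as } d \to \infty. \]

Proof. Dümbgen (1998) shows in Theorem 5.4:

\[ E\|T - S\|_2 = o\big(\sqrt{d/n}\big) \quad \text{as } d \to \infty. \]

It follows that

\[ E\|T^* - S^*\|_2 = \sqrt{n/d}\; E\|T - S\|_2 = o(1) \quad \text{as } d \to \infty, \]

which means

\[ \|T^* - S^*\|_2 \overset{L^1}{\to} 0 \quad \text{as } d \to \infty, \]

which implies the assertion. □

Next, we have:

Proposition 2. Let $A, B$ be $d$-dimensional symmetric matrices. Then:

\[ \forall\, 1 \le i \le d: \quad |\lambda_i(A) - \lambda_i(B)| \le \|A - B\|_2. \]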
Proof. The generalized Weyl inequality says that $\lambda_i(A) + \lambda_{d+1-i}(B) \le \lambda_d(A + B)$
(see, e.g., Schott (2005), Theorem 3.23). Then, we have either

\[ 0 \le \lambda_i(A) - \lambda_i(B) = \lambda_i(A) + \lambda_{d+1-i}(-B) \le \lambda_d(A - B) \le \|A - B\|_2 \]

or

\[ 0 \le \lambda_i(B) - \lambda_i(A) = \lambda_i(B) + \lambda_{d+1-i}(-A) \le \lambda_d(B - A) = -\lambda_1(A - B) \le \|A - B\|_2. \]

□
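The bound of Proposition 2 is deterministic and easy to check numerically. The following sketch draws two random symmetric matrices and compares the largest gap between their ordered eigenvalues with $\|A - B\|_2$. The function names, the seed and the matrix size are illustrative assumptions.

```python
import numpy as np

def spectral_norm_sym(A):
    """||A||_2 = max(|lambda_1(A)|, |lambda_d(A)|) for a symmetric matrix A."""
    eig = np.linalg.eigvalsh(A)
    return max(abs(eig[0]), abs(eig[-1]))

def max_eigenvalue_gap(A, B):
    """max_i |lambda_i(A) - lambda_i(B)| with eigenvalues sorted ascending."""
    return np.max(np.abs(np.linalg.eigvalsh(A) - np.linalg.eigvalsh(B)))

rng = np.random.default_rng(3)               # fixed seed, illustrative only
G, H = rng.standard_normal((2, 6, 6))
A = (G + G.T) / 2                            # random symmetric test matrices
B = (H + H.T) / 2
gap, bound = max_eigenvalue_gap(A, B), spectral_norm_sym(A - B)
print(gap, bound)
```

The comparison pairs the $i$-th smallest eigenvalue of $A$ with the $i$-th smallest eigenvalue of $B$, exactly as in the statement of the proposition.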



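Proposition 3. Let $X \sim N_d(0, I)$. Then, we have that

\[ \|S^*\|_2 \overset{a.s.}{\to} 2 \quad \text{as } d \to \infty. \]

Proof. From Dette (2002), Corollary 2.2, it follows that $\lambda_1(S^*) \overset{a.s.}{\to} -2$ and
$\lambda_d(S^*) \overset{a.s.}{\to} 2$ as $d \to \infty$. □

Now, define the interval

\[ B_i := \{\alpha \lambda_i(T^*) + (1 - \alpha) \lambda_i(S^*) \mid \alpha \in [0, 1]\}. \]

Proposition 4. Again, let $X \sim N_d(0, I)$. Then, we have that

\[ \sup_{\lambda \in \bigcup_{i=1}^{d} B_i} |\lambda| \overset{p}{\to} 2 \quad \text{as } d \to \infty. \]

Proof. From the definition of $B_i$ it holds that

\begin{align*}
\forall\, \lambda \in \textstyle\bigcup_{i=1}^{d} B_i \;\; \exists\, i \in \{1, \ldots, d\},\ \alpha \in [0, 1]: \quad
\lambda &= \alpha \lambda_i(T^*) + (1 - \alpha) \lambda_i(S^*) \\
&= \alpha (\lambda_i(T^*) - \lambda_i(S^*)) + \lambda_i(S^*),
\end{align*}

from which follows:

\begin{align*}
|\lambda| &\le \alpha\, |\lambda_i(T^*) - \lambda_i(S^*)| + |\lambda_i(S^*)| \\
&\le \alpha\, \|T^* - S^*\|_2 + \|S^*\|_2 ,
\end{align*}

where the right-hand side is at most $\|T^* - S^*\|_2 + \|S^*\|_2 \overset{p}{\to} 2$
as $d \to \infty$ using Propositions 1, 2 and 3. Conversely, the choice $\alpha = 0$ shows that
the supremum is at least $\|S^*\|_2$, which also converges to 2, so the assertion follows. □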
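Now, we can estimate (6) as follows:

\begin{align*}
\left| \frac{1}{d} \operatorname{tr}([T^*]^m) - \frac{1}{d} \operatorname{tr}([S^*]^m) \right|
&\le \frac{1}{d} \sum_{i=1}^{d} \left| \lambda_i^m(T^*) - \lambda_i^m(S^*) \right| \\
&\overset{(*)}{\le} \frac{1}{d} \sum_{i=1}^{d} m \sup_{\lambda \in B_i} |\lambda^{m-1}|\;
  \underbrace{|\lambda_i(T^*) - \lambda_i(S^*)|}_{\le\, \|T^* - S^*\|_2 \text{ (Prop. 2)}} \\
&\le m\, \|T^* - S^*\|_2\, \frac{1}{d} \sum_{i=1}^{d}
  \underbrace{\sup_{\lambda \in B_i} |\lambda|^{m-1}}_{\le\, \sup_{\lambda \in \bigcup_{j=1}^{d} B_j} |\lambda|^{m-1}} \\
&\le m\, \underbrace{\|T^* - S^*\|_2}_{\overset{p}{\to}\, 0 \text{ (Prop. 1)}}\;
  \underbrace{\Big( \sup_{\lambda \in \bigcup_{i=1}^{d} B_i} |\lambda| \Big)^{m-1}}_{\overset{p}{\to}\, 2^{m-1} \text{ (Prop. 4)}}
  \overset{p}{\to} 0
\end{align*}

as $d \to \infty$. The inequality $(*)$ is due to the mean value theorem for
continuously differentiable functions. All in all, the convergence in (6) is shown,
which completes the proof of the first part of the theorem. The second part
of the theorem is a simple consequence of the preceding results. We have
that

\begin{align*}
|\lambda_1(T^*) + 2| &\le |\lambda_1(T^*) - \lambda_1(S^*)| + |\lambda_1(S^*) + 2| \\
&\le \|T^* - S^*\|_2 + |\lambda_1(S^*) + 2| \overset{p}{\to} 0, \\
|\lambda_d(T^*) - 2| &\le |\lambda_d(T^*) - \lambda_d(S^*)| + |\lambda_d(S^*) - 2| \\
&\le \|T^* - S^*\|_2 + |\lambda_d(S^*) - 2| \overset{p}{\to} 0
\end{align*}

as $d \to \infty$ using Propositions 1 and 2 and Corollary 2.2 from Dette (2002),
which completes the proof of the theorem.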
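The chain of inequalities used for (6) holds for any pair of symmetric matrices, not only for $T^*$ and $S^*$, and can therefore be checked on arbitrary test matrices. Since each $B_i$ is the interval between $\lambda_i(A)$ and $\lambda_i(B)$, the supremum of $|\lambda|$ over all $B_i$ equals $\max\{\|A\|_2, \|B\|_2\}$, which the sketch below uses. All names, the seed and the sizes are illustrative assumptions.

```python
import numpy as np

def trace_moment(A, m):
    """(1/d) * tr(A^m) for a symmetric matrix A, computed from its eigenvalues."""
    return np.mean(np.linalg.eigvalsh(A) ** m)

def spectral_norm_sym(A):
    """||A||_2 = largest absolute eigenvalue of a symmetric matrix A."""
    return np.max(np.abs(np.linalg.eigvalsh(A)))

def moment_gap_bound(A, B, m):
    """m * ||A - B||_2 * max(||A||_2, ||B||_2)^(m - 1)."""
    sup_lambda = max(spectral_norm_sym(A), spectral_norm_sym(B))
    return m * spectral_norm_sym(A - B) * sup_lambda ** (m - 1)

rng = np.random.default_rng(4)               # fixed seed, illustrative only
G = rng.standard_normal((8, 8))
E = rng.standard_normal((8, 8))
A = (G + G.T) / 2
B = A + 0.01 * (E + E.T) / 2                 # small symmetric perturbation of A
for m in range(1, 6):
    lhs = abs(trace_moment(A, m) - trace_moment(B, m))
    print(m, lhs, moment_gap_bound(A, B, m))
```

In the proof, the first factor of the bound tends to 0 in probability while the second stays bounded, which is why the difference of the trace moments vanishes.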



4. Outlook for future research 



Yin and Krishnaiah (1985) show that, as $n, d \to \infty$ and $d/n \to y \in
(0, 1)$, the ESD of $S$ converges in probability to a non-random limit if the
population is spherical. This limiting distribution is described by its moments
and is unequal to the Marcenko-Pastur law unless the population is standard
normal. In contrast, Frahm and Jaekel (2008), Section 3.2, give evidence
that the limiting ESD of $T$ equals the Marcenko-Pastur law (2) under these
asymptotics if the population is generalized spherical. Regarding the proof
in Section 3, this conjecture will be shown if one proves that $\|T - S\|_2 \overset{p}{\to} 0$
for a standard normal population as $n, d \to \infty$ and $d/n \to y \in (0, 1)$. Note
that considering $y > 1$ is not possible because $T$ does not exist for $n < d$.
An analysis of the proof of Theorem 5.4 in Dümbgen (1998) may provide a
solution to this problem.

Another concern is the establishment of the almost sure convergence of
the ESD of $T$ and $T^*$. Here, we need additional results on the second
moments of $\|T - S\|_2$ and $\|T^* - S^*\|_2$. For example, if one could show that
$\operatorname{Var}(\|T^* - S^*\|_2) = \mathcal{O}(d^{-(1+\delta)})$ as $d \to \infty$ for some $\delta > 0$, then the almost
sure convergence of the ESD of $T^*$ to the semicircle law would follow by the
Borel-Cantelli lemma.



Acknowledgements 

The authors are very grateful to Lutz Dümbgen, Karl Mosler and David
E. Tyler for their helpful comments and suggestions.



References 

Bai, Z., Silverstein, J.W., 2010. Spectral Analysis of Large Dimensional 
Random Matrices. Springer, New York. 2nd edition. 

Bai, Z.D., Yin, Y.Q., 1988. Convergence to the semicircle law. Ann. Probab. 
16, 863-875. 

Bai, Z.D., Zhang, L.X., 2007. Semicircle law for Hadamard products. SIAM 
J. Matrix Anal. Appl. 29, 473-495. 

Dette, H., 2002. Strong approximation of eigenvalues of large dimensional 
Wishart matrices by roots of generalized Laguerre polynomials. J. Approx. 
Theory 118, 290-304. 

Dümbgen, L., 1998. On Tyler's M-functional of scatter in high dimension.
Ann. Inst. Statist. Math. 50, 471-491.






Fang, K.T., Kotz, S., Ng, K.W., 1990. Symmetric Multivariate and Related 
Distributions. Chapman and Hall, London. 1st edition. 

Frahm, G., 2004. Generalized Elliptical Distributions: Theory and Applica- 
tions. Ph.D. thesis. University of Cologne. Department of Economic and 
Social Statistics, Germany. 

Frahm, G., 2009. Asymptotic distributions of robust shape matrices and 
scales. J. Multivariate Anal. 100, 1329-1337. 

Frahm, G., Jaekel, U., 2008. Tyler's M-estimator, random matrix theory and 
generalized elliptical distributions with applications to finance. Working 
paper, Department of Statistics and Econometrics, University of Cologne, 
Germany. 

Frahm, G., Jaekel, U., 2009. A generalization of Tyler's M-estimators to the 
case of incomplete data. Comput. Statist. 54, 374-393. 

Kent, J.T., Tyler, D.E., 1991. Redescending M-estimates of multivariate 
location and scatter. Ann. Statist. 19, 2102-2119. 

Marcenko, V.A., Pastur, L.A., 1967. Distribution of eigenvalues for some 
sets of random matrices. Math. Sb. 72, 457-483. 

Schott, J., 2005. Matrix Analysis for Statistics. Wiley & Sons, New York. 
2nd edition. 

Tyler, D.E., 1987. A distribution-free M-estimator of multivariate scatter. 
Ann. Statist. 15, 234-251. 

Wigner, E.P., 1955. Characteristic vectors of bordered matrices with infinite 
dimensions. Ann. of Math. 62, 548-564. 

Yin, Y.Q., Krishnaiah, P.R., 1985. Limit theorem for the eigenvalues of the 
sample covariance matrix when the underlying distribution is isotropic. 
Theory Probab. Appl. 30, 861-867. 






